murder mystery game
- Law (1.00)
- Information Technology (1.00)
- Leisure & Entertainment > Games > Computer Games (0.46)
WhodunitBench: Evaluating Large Multimodal Agents via Murder Mystery Games
Recently, large language models (LLMs) have achieved superior performance, empowering the development of large multimodal agents (LMAs). An LMA that is expected to execute practical tasks requires various capabilities, including multimodal perception, interaction, reasoning, and decision making. However, existing benchmarks are limited in assessing the compositional skills and actions demanded by practical scenarios, as they focus primarily on single tasks and static scenarios. To bridge this gap, we introduce WhodunitBench, a benchmark rooted in murder mystery games, where players are required to use the aforementioned skills to achieve their objective (i.e., identifying the 'murderer' or hiding themselves), providing a simulated dynamic environment for evaluating LMAs. Specifically, WhodunitBench includes two evaluation modes. The first mode, the arena-style evaluation, is constructed from 50 meticulously curated scripts featuring clear reasoning clues and distinct murderers; the second mode, the chain of evaluation, consists of over 3000 curated multiple-choice and open-ended questions, aiming to assess every facet of murder mystery games for LMAs.
Own a ‘smart speaker’? Your voice also transforms it into a fun gaming platform
Coming soon to Alexa speakers, X2 Games and Atari founder Nolan Bushnell will introduce 'St. If you own an Amazon Echo, Google Home or other "smart speaker," you're likely aware you can use your voice to play music, order a product, and control your smart home gadgets. But you might not know you can also use your voice to play games, whether you're home alone or with family or friends. "There has never been a more natural way to communicate with technology than using your voice," says Katherine Prescott, founder and editor of VoiceBrew, a digital media company dedicated to helping people get the most out of Alexa, with articles, blog posts, and email newsletters. Smart speaker usage is growing.
- North America > United States > New York (0.05)
- North America > United States > California > San Diego County > San Diego (0.05)
- Information Technology (1.00)
- Leisure & Entertainment > Games > Computer Games (0.66)